This notebook continues your exploration of matrices.
Multiplying matrices is a little more complex than the operations we've seen so far. There are two cases to consider, scalar multiplication (multiplying a matrix by a single number), and dot product matrix multiplication (multiplying a matrix by another matrix.
To multiply a matrix by a scalar value, you just multiply each element by the scalar to produce a new matrix:
\begin{equation}2 \times \begin{bmatrix}1 & 2 & 3 \\4 & 5 & 6\end{bmatrix} = \begin{bmatrix}2 & 4 & 6 \\8 & 10 & 12\end{bmatrix}\end{equation}In Python, you perform this calculation using the * operator:
In [ ]:
import numpy as np
A = np.array([[1,2,3],
[4,5,6]])
print(2 * A)
To mulitply two matrices together, you need to calculate the dot product of rows and columns. This means multiplying each of the elements in each row of the first matrix by each of the elements in each column of the second matrix and adding the results. We perform this operation by applying the RC rule - always multiplying Rows by Columns. For this to work, the number of columns in the first matrix must be the same as the number of rows in the second matrix so that the matrices are conformable for the dot product operation.
Sounds confusing, right?
Let's look at an example:
\begin{equation}\begin{bmatrix}1 & 2 & 3 \\4 & 5 & 6\end{bmatrix} \cdot \begin{bmatrix}9 & 8 \\ 7 & 6 \\ 5 & 4\end{bmatrix}\end{equation}Note that the first matrix is 2x3, and the second matrix is 3x2. The important thing here is that the first matrix has two rows, and the second matrix has two columns. To perform the multiplication, we first take the dot product of the first row of the first matrix (1,2,3) and the first column of the second matrix (9,7,5):
\begin{equation}(1,2,3) \cdot (9,7,5) = (1 \times 9) + (2 \times 7) + (3 \times 5) = 38\end{equation}In our resulting matrix (which will always have the same number of rows as the first matrix, and the same number of columns as the second matrix), we can enter this into the first row and first column element:
\begin{equation}\begin{bmatrix}38 & ?\\? & ?\end{bmatrix} \end{equation}Now we can take the dot product of the first row of the first matrix and the second column of the second matrix:
\begin{equation}(1,2,3) \cdot (8,6,4) = (1 \times 8) + (2 \times 6) + (3 \times 4) = 32\end{equation}Let's add that to our resulting matrix in the first row and second column element:
\begin{equation}\begin{bmatrix}38 & 32\\? & ?\end{bmatrix} \end{equation}Now we can repeat this process for the second row of the first matrix and the first column of the second matrix:
\begin{equation}(4,5,6) \cdot (9,7,5) = (4 \times 9) + (5 \times 7) + (6 \times 5) = 101\end{equation}Which fills in the next element in the result:
\begin{equation}\begin{bmatrix}38 & 32\\101 & ?\end{bmatrix} \end{equation}Finally, we get the dot product for the second row of the first matrix and the second column of the second matrix:
\begin{equation}(4,5,6) \cdot (8,6,4) = (4 \times 8) + (5 \times 6) + (6 \times 4) = 86\end{equation}Giving us:
\begin{equation}\begin{bmatrix}38 & 32\\101 & 86\end{bmatrix} \end{equation}In Python, you can use the numpy.dot function or the @ operator to multiply matrices and two-dimensional arrays:
In [ ]:
import numpy as np
A = np.array([[1,2,3],
[4,5,6]])
B = np.array([[9,8],
[7,6],
[5,4]])
print(np.dot(A,B))
print(A @ B)
This is one case where there is a difference in behavior between numpy.array and numpy.matrix, You can also use a regular multiplication (*) operator with a matrix, but not with an array:
In [ ]:
import numpy as np
A = np.matrix([[1,2,3]
,[4,5,6]])
B = np.matrix([[9,8],
[7,6],
[5,4]])
print(A * B)
Note that, unlike with multiplication of regular scalar numbers, the order of the operands in a multiplication operation is significant. For scalar numbers, the commmutative law of multiplication applies, so for example:
\begin{equation}2 \times 4 = 4 \times 2\end{equation}With matrix multiplication, things are different, for example:
\begin{equation}\begin{bmatrix}2 & 4 \\6 & 8\end{bmatrix} \cdot \begin{bmatrix}1 & 3 \\ 5 & 7\end{bmatrix} \ne \begin{bmatrix}1 & 3 \\ 5 & 7\end{bmatrix} \cdot \begin{bmatrix}2 & 4 \\6 & 8\end{bmatrix}\end{equation}Run the following Python code to test this:
In [ ]:
import numpy as np
A = np.array([[2,4],
[6,8]])
B = np.array([[1,3],
[5,7]])
print(A @ B)
print(B @ A)
An identity matrix (usually indicated by a capital I) is the equivalent in matrix terms of the number 1. It always has the same number of rows as columns, and it has the value 1 in the diagonal element positions I1,1, I2,2, etc; and 0 in all other element positions. Here's an example of a 3x3 identity matrix:
\begin{equation}\begin{bmatrix}1 & 0 & 0\\0 & 1 & 0\\0 & 0 & 1\end{bmatrix} \end{equation}Multiplying any matrix by an identity matrix is the same as multiplying a number by 1; the result is the same as the original value:
\begin{equation}\begin{bmatrix}1 & 2 & 3 \\4 & 5 & 6\\7 & 8 & 9\end{bmatrix} \cdot \begin{bmatrix}1 & 0 & 0\\0 & 1 & 0\\0 & 0 & 1\end{bmatrix} = \begin{bmatrix}1 & 2 & 3 \\4 & 5 & 6\\7 & 8 & 9\end{bmatrix} \end{equation}If you doubt me, try the following Python code!
In [ ]:
import numpy as np
A = np.array([[1,2,3],
[4,5,6],
[7,8,9]])
B = np.array([[1,0,0],
[0,1,0],
[0,0,1]])
print(A @ B)
You can't actually divide by a matrix; but when you want to divide matrices, you can take advantage of the fact that division by a given number is the same as multiplication by the reciprocal of that number. For example:
\begin{equation}6 \div 3 = \frac{1}{3}\times 6 \end{equation}In this case, 1/3 is the reciprocal of 3 (which as a fraction is 3/1 - we "flip" the numerator and denominator to get the reciprocal). You can also write 1/3 as 3-1.
For matrix division, we use a related idea; we multiply by the inverse of a matrix:
\begin{equation}A \div B = A \cdot B^{-1}\end{equation}The inverse of B is B-1 as long as the following equation is true:
\begin{equation}B \cdot B^{-1} = B^{-1} \cdot B = I\end{equation}I, you may recall, is an identity matrix; the matrix equivalent of 1.
So how do you calculate the inverse of a matrix? For a 2x2 matrix, you can follow this formula:
\begin{equation}\begin{bmatrix}a & b\\c & d\end{bmatrix}^{-1} = \frac{1}{ad-bc} \begin{bmatrix}d & -b\\-c & a\end{bmatrix}\end{equation}What happened there?
Let's try with some actual numbers:
\begin{equation}\begin{bmatrix}6 & 2\\1 & 2\end{bmatrix}^{-1} = \frac{1}{(6\times2)-(2\times1)} \begin{bmatrix}2 & -2\\-1 & 6\end{bmatrix}\end{equation}So:
\begin{equation}\begin{bmatrix}6 & 2\\1 & 2\end{bmatrix}^{-1} = \frac{1}{10} \begin{bmatrix}2 & -2\\-1 & 6\end{bmatrix}\end{equation}Which gives us the result:
\begin{equation}\begin{bmatrix}6 & 2\\1 & 2\end{bmatrix}^{-1} = \begin{bmatrix}0.2 & -0.2\\-0.1 & 0.6\end{bmatrix}\end{equation}To check this, we can multiply the original matrix by its inverse to see if we get an identity matrix. This makes sense if you think about it; in the same way that 3 x 1/3 = 1, a matrix multiplied by its inverse results in an identity matrix:
\begin{equation}\begin{bmatrix}6 & 2\\1 & 2\end{bmatrix} \cdot \begin{bmatrix}0.2 & -0.2\\-0.1 & 0.6\end{bmatrix} = \begin{bmatrix}(6\times0.2)+(2\times-0.1) & (6\times-0.2)+(2\times0.6)\\(1\times0.2)+(2\times-0.1) & (1\times-0.2)+(2\times0.6)\end{bmatrix} = \begin{bmatrix}1 & 0\\0 & 1\end{bmatrix}\end{equation}Note that not every matrix has an inverse - for example, if the determinant works out to be 0, the inverse matrix is not defined.
In Python, you can use the numpy.linalg.inv function to get the inverse of a matrix in an array or matrix object:
In [ ]:
import numpy as np
B = np.array([[6,2],
[1,2]])
print(np.linalg.inv(B))
Additionally, the matrix type has an I method that returns the inverse matrix:
In [ ]:
import numpy as np
B = np.matrix([[6,2],
[1,2]])
print(B.I)
For larger matrices, the process to calculate the inverse is more complex. Let's explore an example based on the following matrix:
\begin{equation}\begin{bmatrix}4 & 2 & 2\\6 & 2 & 4\\2 & 2 & 8\end{bmatrix} \end{equation}The process to find the inverse consists of the following steps:
1: Create a matrix of minors by calculating the determinant for each element in the matrix based on the elements that are not in the same row or column; like this:
\begin{equation}\begin{bmatrix}\color{blue}4 & \color{lightgray}2 & \color{lightgray}2\\\color{lightgray}6 & \color{red}2 & \color{red}4\\\color{lightgray}2 & \color{red}2 & \color{red}8\end{bmatrix}\;\;\;\;(2\times8) - (4\times2) = 8\;\;\;\;\begin{bmatrix}8 & \color{lightgray}? & \color{lightgray}?\\\color{lightgray}? & \color{lightgray}? & \color{lightgray}?\\\color{lightgray}? & \color{lightgray}? & \color{lightgray}?\end{bmatrix} \end{equation}\begin{equation}\begin{bmatrix}\color{lightgray}4 & \color{blue}2 & \color{lightgray}2\\\color{red}6 & \color{lightgray}2 & \color{red}4\\\color{red}2 & \color{lightgray}2 & \color{red}8\end{bmatrix}\;\;\;\;(6\times8) - (4\times2) = 40\;\;\;\;\begin{bmatrix}8 & 40 & \color{lightgray}?\\\color{lightgray}? & \color{lightgray}? & \color{lightgray}?\\\color{lightgray}? & \color{lightgray}? & \color{lightgray}?\end{bmatrix}\end{equation}\begin{equation}\begin{bmatrix}\color{lightgray}4 & \color{lightgray}2 & \color{blue}2\\\color{red}6 & \color{red}2 & \color{lightgray}4\\\color{red}2 & \color{red}2 & \color{lightgray}8\end{bmatrix}\;\;\;\;(6\times2) - (2\times2) = 8\;\;\;\;\begin{bmatrix}8 & 40 & 8\\\color{lightgray}? & \color{lightgray}? & \color{lightgray}?\\\color{lightgray}? & \color{lightgray}? & \color{lightgray}?\end{bmatrix} \end{equation}\begin{equation}\begin{bmatrix}\color{lightgray}4 & \color{red}2 & \color{red}2\\\color{blue}6 & \color{lightgray}2 & \color{lightgray}4\\\color{lightgray}2 & \color{red}2 & \color{red}8\end{bmatrix}\;\;\;\;(2\times8) - (2\times2) = 12\;\;\;\;\begin{bmatrix}8 & 40 & 8\\12 & \color{lightgray}? & \color{lightgray}?\\\color{lightgray}? & \color{lightgray}? & \color{lightgray}?\end{bmatrix} \end{equation}\begin{equation}\begin{bmatrix}\color{red}4 & \color{lightgray}2 & \color{red}2\\\color{lightgray}6 & \color{blue}2 & \color{lightgray}4\\\color{red}2 & \color{lightgray}2 & \color{red}8\end{bmatrix}\;\;\;\;(4\times8) - (2\times2) = 28\;\;\;\;\begin{bmatrix}8 & 40 & 8\\12 & 28 & \color{lightgray}?\\\color{lightgray}? & \color{lightgray}? & \color{lightgray}?\end{bmatrix} \end{equation}\begin{equation}\begin{bmatrix}\color{red}4 & \color{red}2 & \color{lightgray}2\\\color{lightgray}6 & \color{lightgray}2 & \color{blue}4\\\color{red}2 & \color{red}2 & \color{lightgray}8\end{bmatrix}\;\;\;\;(4\times2) - (2\times2) = 4\;\;\;\;\begin{bmatrix}8 & 40 & 8\\12 & 28 & 4\\\color{lightgray}? & \color{lightgray}? & \color{lightgray}?\end{bmatrix} \end{equation}\begin{equation}\begin{bmatrix}\color{lightgray}4 & \color{red}2 & \color{red}2\\\color{lightgray}6 & \color{red}2 & \color{red}4\\\color{blue}2 & \color{lightgray}2 & \color{lightgray}8\end{bmatrix}\;\;\;\;(2\times4) - (2\times2) = 4\;\;\;\;\begin{bmatrix}8 & 40 & 8\\12 & 28 & 4\\4 & \color{lightgray}? & \color{lightgray}?\end{bmatrix} \end{equation}\begin{equation}\begin{bmatrix}\color{red}4 & \color{lightgray}2 & \color{red}2\\\color{red}6 & \color{lightgray}2 & \color{red}4\\\color{lightgray}2 & \color{blue}2 & \color{lightgray}8\end{bmatrix}\;\;\;\;(4\times4) - (2\times6) = 4\;\;\;\;\begin{bmatrix}8 & 40 & 8\\12 & 28 & 4\\4 & 4 & \color{lightgray}?\end{bmatrix} \end{equation}\begin{equation}\begin{bmatrix}\color{red}4 & \color{red}2 & \color{lightgray}2\\\color{red}6 & \color{red}2 & \color{lightgray}4\\\color{lightgray}2 & \color{lightgray}2 & \color{blue}8\end{bmatrix}\;\;\;\;(4\times2) - (2\times6) = -4\;\;\;\;\begin{bmatrix}8 & 40 & 8\\12 & 28 & 4\\4 & 4 & -4\end{bmatrix} \end{equation}2: Apply cofactors to the matrix by switching the sign of every alternate element in the matrix of minors:
\begin{equation}\begin{bmatrix}8 & -40 & 8\\-12 & 28 & -4\\4 & -4 & -4\end{bmatrix} \end{equation}3: Adjugate by transposing elements diagonally:
\begin{equation}\begin{bmatrix}8 & \color{green}-\color{green}1\color{green}2 & \color{orange}4\\\color{green}-\color{green}4\color{green}0 & 28 & \color{purple}-\color{purple}4\\\color{orange}8 & \color{purple}-\color{purple}4 & -4\end{bmatrix} \end{equation}4: Multiply by 1/determinant of the original matrix. To find this, multiply each of the top row elements by their corresponding minor determinants (which we calculated earlier in the matrix of minors), and then subtract the second from the first and add the third:
\begin{equation}Determinant = (4 \times 8) - (2 \times 40) + (2 \times 8) = -32\end{equation}\begin{equation}\frac{1}{-32}\begin{bmatrix}8 & -12 & 4\\-40 & 28 & -4\\8 & -4 & -4\end{bmatrix} = \begin{bmatrix}-0.25 & 0.375 & -0.125\\1.25 & -0.875 & 0.125\\-0.25 & 0.125 & 0.125\end{bmatrix}\end{equation}Let's verify that the original matrix multiplied by the inverse results in an identity matrix:
\begin{equation}\begin{bmatrix}4 & 2 & 2\\6 & 2 & 4\\2 & 2 & 8\end{bmatrix} \cdot \begin{bmatrix}-0.25 & 0.375 & -0.125\\1.25 & -0.875 & 0.125\\-0.25 & 0.125 & 0.125\end{bmatrix}\end{equation}\begin{equation}= \begin{bmatrix}(4\times-0.25)+(2\times1.25)+(2\times-0.25) & (4\times0.375)+(2\times-0.875)+(2\times0.125) & (4\times-0.125)+(2\times-0.125)+(2\times0.125)\\(6\times-0.25)+(2\times1.25)+(4\times-0.25) & (6\times0.375)+(2\times-0.875)+(4\times0.125) & (6\times-0.125)+(2\times-0.125)+(4\times0.125)\\(2\times-0.25)+(2\times1.25)+(8\times-0.25) & (2\times0.375)+(2\times-0.875)+(8\times0.125) & (2\times-0.125)+(2\times-0.125)+(8\times0.125)\end{bmatrix} \end{equation}\begin{equation}= \begin{bmatrix}1 & 0 & 0\\0 & 1 & 0\\0 & 0 & 1\end{bmatrix} \end{equation}As you can see, this can get pretty complicated - which is why we usually use a calculator or a computer program. You can run the following Python code to verify that the inverse matrix we calculated is correct:
In [ ]:
import numpy as np
B = np.array([[4,2,2],
[6,2,4],
[2,2,8]])
print(np.linalg.inv(B))
Now that you know how to calculate an inverse matrix, you can use that knowledge to multiply the inverse of a matrix by another matrix as an alternative to division:
\begin{equation}\begin{bmatrix}1 & 2\\3 & 4\end{bmatrix} \cdot \begin{bmatrix}6 & 2\\1 & 2\end{bmatrix}^{-1} \end{equation}\begin{equation}=\begin{bmatrix}1 & 2\\3 & 4\end{bmatrix} \cdot \begin{bmatrix}0.2 & -0.2\\-0.1 & 0.6\end{bmatrix} \end{equation}\begin{equation}=\begin{bmatrix}(1\times0.2)+(2\times-0.1) & (1\times-0.2)+(2\times0.6)\\(3\times0.2)+(4\times-0.1) & (3\times-0.2)+(4\times0.6)\end{bmatrix}\end{equation}\begin{equation}=\begin{bmatrix}0 & 1\\0.2 & 1.8\end{bmatrix}\end{equation}Here's the Python code to calculate this:
In [ ]:
import numpy as np
A = np.array([[1,2],
[3,4]])
B = np.array([[6,2],
[1,2]])
C = A @ np.linalg.inv(B)
print(C)
One of the great things about matrices, is that they can help us solve systems of equations. For example, consider the following system of equations:
\begin{equation}2x + 4y = 18\end{equation}\begin{equation}6x + 2y = 34\end{equation}We can write this in matrix form, like this:
\begin{equation}\begin{bmatrix}2 & 4\\6 & 2\end{bmatrix} \cdot \begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}18\\34\end{bmatrix}\end{equation}Note that the variables (x and y) are arranged as a column in one matrix, which is multiplied by a matrix containing the coefficients to produce as matrix containing the results. If you calculate the dot product on the left side, you can see clearly that this represents the original equations:
\begin{equation}\begin{bmatrix}2x + 4y\\6x + 2y\end{bmatrix} =\begin{bmatrix}18\\34\end{bmatrix}\end{equation}Now. let's name our matrices so we can better understand what comes next:
\begin{equation}A=\begin{bmatrix}2 & 4\\6 & 2\end{bmatrix}\;\;\;\;X=\begin{bmatrix}x\\y\end{bmatrix}\;\;\;\;B=\begin{bmatrix}18\\34\end{bmatrix}\end{equation}We already know that A • X = B, which arithmetically means that X = B &div A. Since we can't actually divide by a matrix, we need to multiply by the inverse; so we can find the values for our variables (X) like this: X = A-1 • B
So, first we need the inverse of A:
\begin{equation}\begin{bmatrix}2 & 4\\6 & 2\end{bmatrix}^{-1} = \frac{1}{(2\times2)-(4\times6)} \begin{bmatrix}2 & -4\\-6 & 2\end{bmatrix}\end{equation}\begin{equation}= \frac{1}{-20} \begin{bmatrix}2 & -4\\-6 & 2\end{bmatrix}\end{equation}\begin{equation}=\begin{bmatrix}-0.1 & 0.2\\0.3 & -0.1\end{bmatrix}\end{equation}Then we just multiply this with B:
\begin{equation}X = \begin{bmatrix}-0.1 & 0.2\\0.3 & -0.1\end{bmatrix} \cdot \begin{bmatrix}18\\34\end{bmatrix}\end{equation}\begin{equation}X = \begin{bmatrix}(-0.1 \times 18)+(0.2 \times 34)\\(0.3\times18)+(-0.1\times34)\end{bmatrix}\end{equation}\begin{equation}X = \begin{bmatrix}5\\2\end{bmatrix}\end{equation}The resulting matrix (X) contains the values for our x and y variables, and we can check these by plugging them into the original equations:
\begin{equation}(2\times5) + (4\times2) = 18\end{equation}\begin{equation}(6\times5) + (2\times2) = 34\end{equation}These of course simplify to:
\begin{equation}10 + 8 = 18\end{equation}\begin{equation}30 + 4 = 34\end{equation}So our variable values are correct.
Here's the Python code to do all of this:
In [ ]:
import numpy as np
A = np.array([[2,4],
[6,2]])
B = np.array([[18],
[34]])
C = np.linalg.inv(A) @ B
print(C)
In [ ]: